172 research outputs found

    Dual Correction Strategy for Ranking Distillation in Top-N Recommender System

    Full text link
    Knowledge Distillation (KD), which transfers the knowledge of a well-trained large model (teacher) to a small model (student), has become an important area of research for practical deployment of recommender systems. Recently, Relaxed Ranking Distillation (RRD) has shown that distilling the ranking information in the recommendation list significantly improves the performance. However, the method still has limitations in that 1) it does not fully utilize the prediction errors of the student model, which makes the training not fully efficient, and 2) it only distills the user-side ranking information, which provides an insufficient view under the sparse implicit feedback. This paper presents Dual Correction strategy for Distillation (DCD), which transfers the ranking information from the teacher model to the student model in a more efficient manner. Most importantly, DCD uses the discrepancy between the teacher model and the student model predictions to decide which knowledge to be distilled. By doing so, DCD essentially provides the learning guidance tailored to "correcting" what the student model has failed to accurately predict. This process is applied for transferring the ranking information from the user-side as well as the item-side to address sparse implicit user feedback. Our experiments show that the proposed method outperforms the state-of-the-art baselines, and ablation studies validate the effectiveness of each component

    Understanding Grip Shifts: How Form Factors Impact Hand Movements on Mobile Phones

    Get PDF
    In this paper we present an investigation into how hand usage is affected by different mobile phone form factors. Our initial (qualitative) study explored how users interact with various mobile phone types (touchscreen, physical keyboard and stylus). The analysis of the videos revealed that each type of mobile phone affords specific handgrips and that the user shifts these grips and consequently the tilt and rotation of the phone depending on the context of interaction. In order to further investigate the tilt and rotation effects we conducted a controlled quantitative study in which we varied the size of the phone and the type of grips (Symmetric bimanual, Asymmetric bimanual with finger, Asymmetric bimanual with thumb and Single handed) to better understand how they affect the tilt and rotation during a dual pointing task. The results showed that the size of the phone does have a consequence and that the distance needed to reach action items affects the phones’ tilt and rotation. Additionally, we found that the amount of tilt, rotation and reach required corresponded with the participant’s grip preference. We finish the paper by discussing the design lessons for mobile UI and proposing design guidelines and applications for these insights

    Exploration in Gradient-Based Reinforcement Learning

    Get PDF
    Gradient-based policy search is an alternative to value-function-based methods for reinforcement learning in non-Markovian domains. One apparent drawback of policy search is its requirement that all actions be 'on-policy'; that is, that there be no explicit exploration. In this paper, we provide a method for using importance sampling to allow any well-behaved directed exploration policy during learning. We show both theoretically and experimentally that using this method can achieve dramatic performance improvements

    Understanding face and eye visibility in front-facing cameras of smartphones used in the wild

    Get PDF
    Commodity mobile devices are now equipped with high-resolution front-facing cameras, allowing applications in biometrics (e.g., FaceID in the iPhone X), facial expression analysis, or gaze interaction. However, it is unknown how often users hold devices in a way that allows capturing their face or eyes, and how this impacts detection accuracy. We collected 25,726 in-the-wild photos, taken from the front-facing camera of smartphones as well as associated application usage logs. We found that the full face is visible about 29% of the time, and that in most cases the face is only partially visible. Furthermore, we identified an influence of users' current activity; for example, when watching videos, the eyes but not the entire face are visible 75% of the time in our dataset. We found that a state-of-the-art face detection algorithm performs poorly against photos taken from front-facing cameras. We discuss how these findings impact mobile applications that leverage face and eye detection, and derive practical implications to address state-of-the art's limitations

    Adapting Text-based Dialogue State Tracker for Spoken Dialogues

    Full text link
    Although there have been remarkable advances in dialogue systems through the dialogue systems technology competition (DSTC), it remains one of the key challenges to building a robust task-oriented dialogue system with a speech interface. Most of the progress has been made for text-based dialogue systems since there are abundant datasets with written corpora while those with spoken dialogues are very scarce. However, as can be seen from voice assistant systems such as Siri and Alexa, it is of practical importance to transfer the success to spoken dialogues. In this paper, we describe our engineering effort in building a highly successful model that participated in the speech-aware dialogue systems technology challenge track in DSTC11. Our model consists of three major modules: (1) automatic speech recognition error correction to bridge the gap between the spoken and the text utterances, (2) text-based dialogue system (D3ST) for estimating the slots and values using slot descriptions, and (3) post-processing for recovering the error of the estimated slot value. Our experiments show that it is important to use an explicit automatic speech recognition error correction module, post-processing, and data augmentation to adapt a text-based dialogue state tracker for spoken dialogue corpora.Comment: 8 pages, 5 figures, Accepted at the DSTC 11 Workshop to be located at SIGDIAL 202
    • …
    corecore